
    Direction Specific Ambisonics Source Separation with End-To-End Deep Learning

    Ambisonics is a scene-based spatial audio format that has several useful features compared to object-based formats, such as efficient whole-scene rotation and versatility. However, it does not provide direct access to the individual source signals, so these have to be separated from the mixture when required. Typically, this is done with linear spherical harmonics (SH) beamforming. In this paper, we explore deep-learning-based source separation on static Ambisonics mixtures. In contrast to most source separation approaches, which separate a fixed number of sources of specific sound types, we focus on separating arbitrary sound from specific directions. Specifically, we propose three operating modes that combine a source separation neural network with SH beamforming: refinement, implicit, and mixed mode. We show that a neural network can implicitly associate conditioning directions with the spatial information contained in the Ambisonics scene to extract specific sources. We evaluate the performance of the three proposed approaches and compare them to SH beamforming on musical mixtures generated with the musdb18 dataset, as well as with mixtures generated with the FUSS dataset for universal source separation, under both anechoic and room conditions. Results show that the proposed approaches offer improved separation performance and spatial selectivity compared to conventional SH beamforming.
    Comment: To be published in Acta Acustica. Code and listening examples: https://github.com/francesclluis/direction-ambisonics-source-separatio
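    The linear SH beamforming baseline mentioned above can be sketched in a few lines. The following is a minimal sketch assuming ACN channel ordering with orthonormal (N3D) real spherical harmonics and unit-gain normalisation towards the look direction; function names and conventions are illustrative, not taken from the linked code.

        import numpy as np
        from scipy.special import sph_harm

        def real_sph_harm(m, n, azi, zen):
            """Real-valued spherical harmonic of degree n, order m (no Condon-Shortley phase)."""
            if m > 0:
                return np.sqrt(2.0) * (-1.0) ** m * sph_harm(m, n, azi, zen).real
            if m < 0:
                return np.sqrt(2.0) * (-1.0) ** m * sph_harm(-m, n, azi, zen).imag
            return sph_harm(0, n, azi, zen).real

        def sh_beamformer(ambi, order, azi, zen):
            """Plain ('basic') SH beamformer: weight each ACN channel by the real SH
            evaluated at the look direction and sum over channels.

            ambi : (num_samples, (order+1)**2) Ambisonics signals in ACN order, N3D
            azi  : azimuth of the look direction in radians
            zen  : zenith (colatitude) of the look direction in radians
            """
            weights = np.array([real_sph_harm(m, n, azi, zen)
                                for n in range(order + 1)
                                for m in range(-n, n + 1)])
            # normalise so a plane wave arriving from the look direction passes with unit gain
            weights *= 4.0 * np.pi / (order + 1) ** 2
            return ambi @ weights

        # usage sketch: extract whatever arrives from 90 degrees to the left in a first-order scene
        # signal = sh_beamformer(b_format_acn, order=1, azi=np.pi / 2, zen=np.pi / 2)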

    Predicting perceptual transparency of head-worn devices

    openaire: EC/H2020/812719/EU//VRACE
    Acoustically transparent head-worn devices are a key component of auditory augmented reality systems, in which both real and virtual sound sources are presented to a listener simultaneously. Head-worn devices can exhibit high transparency simply through their physical design, but in practice they will always obstruct the sound field to some extent. In this study, a method for predicting the perceptual transparency of head-worn devices is presented using numerical analysis of device measurements, testing both coloration and localization in the horizontal and median planes. Firstly, listening experiments are conducted to assess perceived coloration and localization impairments. Secondly, head-related transfer functions of a dummy head wearing the head-worn devices are measured, and auditory models are used to numerically quantify the introduced perceptual effects. The results show that the tested auditory models are capable of predicting perceptual transparency and are therefore robust in applications they were not initially designed for.
    Peer reviewed
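    As a rough illustration of the kind of numerical coloration analysis described above, the sketch below compares device and open-ear impulse-response magnitudes in ERB-spaced bands and reduces the difference to a single number. The band layout and the error measure are assumptions for illustration only; they are not the auditory models evaluated in the study.

        import numpy as np

        def erb_centre_freqs(fmin=50.0, fmax=16000.0, step=1.0):
            """Centre frequencies spaced by one ERB (Glasberg & Moore scale)."""
            def to_erb(f):
                return 21.4 * np.log10(4.37e-3 * f + 1.0)
            def from_erb(e):
                return (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
            return from_erb(np.arange(to_erb(fmin), to_erb(fmax), step))

        def band_levels(ir, fs, centres):
            """Energy per ERB-wide band of an impulse response, in dB."""
            spec = np.abs(np.fft.rfft(ir, n=4 * len(ir))) ** 2
            freqs = np.fft.rfftfreq(4 * len(ir), 1.0 / fs)
            widths = 24.7 * (4.37e-3 * centres + 1.0)   # ERB bandwidth at each centre
            levels = []
            for fc, bw in zip(centres, widths):
                band = (freqs >= fc - bw / 2.0) & (freqs < fc + bw / 2.0)
                energy = spec[band].mean() if band.any() else 1e-12
                levels.append(10.0 * np.log10(energy + 1e-12))
            return np.array(levels)

        def coloration_score(hrir_open, hrir_device, fs):
            """Mean absolute band-level difference (dB) as a simple coloration proxy."""
            centres = erb_centre_freqs()
            diff = band_levels(hrir_device, fs, centres) - band_levels(hrir_open, fs, centres)
            return np.mean(np.abs(diff))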

    Transfer-Plausibility of Binaural Rendering with Different Real-World References

    For the evaluation of virtual acoustics for mixed realities, we distinguish between the paradigms "authenticity", "plausibility" and "transfer-plausibility". In the case of authenticity, discrimination tasks between real sound sources and virtual renderings presented over headphones are performed, whereas in a plausibility experiment, listeners rely only on their expectation of a sound when listening to the rendering, without an explicit reference being present. In the case of transfer-plausibility, however, different real sources are active alongside the virtual sources, potentially in different spatial locations, leading to a certain degree of comparability. This resembles the case of forthcoming augmented reality systems. Here, we present an experiment that assesses the transfer-plausibility of rendered speech sources in a variable acoustic environment. We demonstrate the influence of the similarity between real and virtual source material, and of their spatial location, on the transfer-plausibility of measurement-based headphone rendering.
    Non peer reviewed

    Blind Directional Room Impulse Response Parameterization from Relative Transfer Functions

    Funding Information: This research has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie SkƂodowska-Curie grant agreement No. 812719. Publisher Copyright: © 2022 IEEE.
    Acquiring information about an acoustic environment without conducting dedicated measurements is an important problem of forthcoming augmented reality applications, in which real and virtual sound sources are combined. We propose a straightforward method for estimating directional room impulse responses from running signals. We adaptively identify relative transfer functions between the output of a beamformer pointing in the direction of a single active sound source and the complete set of spherical harmonics domain signals, representing all directions. To this end, estimation is performed with a frequency-domain recursive least squares algorithm. Then, parameters such as the directions of arrival of early reflections and the reverberation time are extracted. Estimation of the direct-to-reverberant ratio requires dedicated processing. We show examples of successful estimation from speech signals, based on a simulated and a measured response.
    Peer reviewed
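    A minimal sketch of a per-bin, frequency-domain recursive least squares update of the kind described above: for each STFT bin, one complex relative-transfer-function coefficient between the beamformer reference and a spherical harmonics channel is tracked with an exponential forgetting factor. Variable names, the forgetting factor and the initialisation are illustrative assumptions, not the paper's implementation.

        import numpy as np

        class BinwiseRLS:
            """Exponentially weighted RLS for one complex coefficient per frequency bin.

            Tracks H[k] such that Y[k, t] ~= H[k] * X[k, t], where X is the STFT of the
            beamformer (reference) signal and Y the STFT of one spherical harmonics channel.
            """

            def __init__(self, num_bins, forget=0.98, delta=1e-2):
                self.lam = forget
                self.H = np.zeros(num_bins, dtype=complex)   # relative transfer function estimate
                self.P = np.full(num_bins, 1.0 / delta)      # inverse (weighted) input power

            def update(self, X, Y):
                """One STFT frame: X, Y are complex arrays of length num_bins."""
                gain = self.P * np.conj(X) / (self.lam + self.P * np.abs(X) ** 2)
                err = Y - self.H * X                         # a-priori estimation error
                self.H = self.H + gain * err
                self.P = (self.P - gain * X * self.P) / self.lam
                return self.H

        # usage sketch: loop over STFT frames of the running signal
        # rls = BinwiseRLS(num_bins=513)
        # for X_frame, Y_frame in zip(stft_reference, stft_sh_channel):
        #     rtf = rls.update(X_frame, Y_frame)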

    Assessing Room Acoustic Memory using a Yes/No and a 2-AFC Paradigm

    We present a study that tests the ability to remember room acoustics, a cognitive skill that is one of the guiding mechanisms behind plausible virtual acoustics for extended realities. Room acoustic memory was tested by assessing a person's ability to recognise sound samples convolved with room impulse responses of everyday rooms, presented in a preceding training session. To test a common assumption of detection theory, we conducted two listening tests using both a yes/no and a 2-AFC paradigm. Results show that subjects can recognise different rooms above chance level, but even with relatively large differences between the rooms, the accuracy is low in general. Furthermore, the relation between the two test paradigms follows the prediction of detection theory when averaging over all participants, but less so for individual participants.
    Peer reviewed
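    The detection-theory assumption tested here is usually stated as follows: the sensitivity d' estimated from yes/no hit and false-alarm rates should predict the 2-AFC proportion correct via Pc = Phi(d'/sqrt(2)). Below is a minimal sketch of that textbook relation under the equal-variance Gaussian model; the example rates are hypothetical and not data from the study.

        import numpy as np
        from scipy.stats import norm

        def dprime_yes_no(hit_rate, false_alarm_rate):
            """Equal-variance Gaussian sensitivity from a yes/no task: d' = z(H) - z(F)."""
            return norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)

        def predicted_2afc_pc(dprime):
            """Predicted 2-AFC proportion correct for the same d': Pc = Phi(d' / sqrt(2))."""
            return norm.cdf(dprime / np.sqrt(2.0))

        # hypothetical example rates (not data from the study)
        d = dprime_yes_no(hit_rate=0.70, false_alarm_rate=0.40)   # ~0.78
        print(predicted_2afc_pc(d))                               # ~0.71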

    Fade-In Control for Feedback Delay Networks

    In virtual acoustics, it is common to simulate the early part of a room impulse response using approaches from geometrical acoustics and the late part using Feedback Delay Networks (FDNs). In order to transition from the early to the late part, it is useful to fade in the FDN response slowly. We propose two methods to control the fade-in, one based on double decays and the other based on modal beating. We use modal analysis to explain the two concepts for incorporating this fade-in behaviour entirely within the IIR structure of a multiple-input multiple-output FDN. We present design equations, which allow for placing the fade-in time at an arbitrary point within its derived limit.
    Peer reviewed
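    As a rough illustration of the double-decay idea named above (not the paper's design equations): subtracting a fast exponential decay from a slower one yields an envelope that first rises and then decays, and the time of its maximum, which plays the role of the fade-in time, follows from setting the derivative to zero. The sketch below uses illustrative decay times.

        import numpy as np

        def double_decay_envelope(t, t60_slow, t60_fast):
            """Fade-in envelope as the difference of two exponential decays.

            Each T60 is converted to a decay rate a such that exp(-a * t) has dropped
            by 60 dB at t = T60; the faster decay is subtracted from the slower one.
            """
            a = np.log(1000.0) / t60_slow   # slow decay rate
            b = np.log(1000.0) / t60_fast   # fast decay rate (b > a)
            return np.exp(-a * t) - np.exp(-b * t)

        def fade_in_time(t60_slow, t60_fast):
            """Time of the envelope maximum: d/dt [e^(-a t) - e^(-b t)] = 0  =>  t = ln(b/a) / (b - a)."""
            a = np.log(1000.0) / t60_slow
            b = np.log(1000.0) / t60_fast
            return np.log(b / a) / (b - a)

        # e.g. a 2 s late reverb combined with a 0.2 s cancelling decay
        t = np.linspace(0.0, 2.0, 96000)
        env = double_decay_envelope(t, t60_slow=2.0, t60_fast=0.2)
        print(fade_in_time(2.0, 0.2))   # ~0.074 s: where the fade-in reaches its peak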

    Clearly audible room acoustical differences may not reveal where you are in a room

    openaire: EC/H2020/812719/EU//VRACE
    A common aim in virtual reality room acoustics simulation is accurate listener-position-dependent rendering. However, it is unclear whether a mismatch between the acoustics and the visual representation of a room influences the experience or is even noticeable. Here, we ask whether listeners without any special experience in echolocation are able to identify their position in a room based on the acoustics alone. In a first test, direct comparison between acoustic recordings from the different positions in the room revealed clearly audible differences, which subjects described with various acoustic attributes. The design of the subsequent experiment allows participants to move around and explore the sound within different zones in this room while switching between visual renderings of the zones in a head-mounted display. The results show that identification was only possible in some special cases. In about 74% of all trials, listeners were not able to determine where they were in the room. The results imply that audible position-dependent room acoustic rendering in virtual reality may not be noticeable under certain conditions, which highlights the importance of the choice of evaluation paradigm when assessing virtual acoustics.
    Peer reviewed